AITopics | occupancy frequency

Collaborating Authors

occupancy frequency

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

72393bd47a35f5b3bee4c609e7bba733-Paper-Conference.pdf

Neural Information Processing SystemsFeb-15-2026, 20:09:54 GMT

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Colorado (0.04)
North America > United States > Arizona (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)

Add feedback

Expectation Alignment: Handling Reward Misspecification in the Presence of Expectation Mismatch

Neural Information Processing SystemsOct-10-2025, 06:04:55 GMT

occupancy frequency, optimal policy, reward function, (15 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Colorado (0.04)
North America > United States > Arizona (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)

Add feedback

RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning

Marek Petrik, Dharmashankar Subramanian

Neural Information Processing SystemsFeb-9-2025, 07:48:39 GMT

aggregation, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)

Add feedback

Handling Reward Misspecification in the Presence of Expectation Mismatch

Sreedharan, Sarath, Mechergui, Malek

arXiv.org Artificial IntelligenceApr-12-2024

Detecting and handling misspecified objectives, such as reward functions, has been widely recognized as one of the central challenges within the domain of Artificial Intelligence (AI) safety research. However, even with the recognition of the importance of this problem, we are unaware of any works that attempt to provide a clear definition for what constitutes (a) misspecified objectives and (b) successfully resolving such misspecifications. In this work, we use the theory of mind, i.e., the human user's beliefs about the AI agent, as a basis to develop a formal explanatory framework called Expectation Alignment (EAL) to understand the objective misspecification and its causes. Our \EAL\ framework not only acts as an explanatory framework for existing works but also provides us with concrete insights into the limitations of existing methods to handle reward misspecification and novel solution strategies. We use these insights to propose a new interactive algorithm that uses the specified reward to infer potential user expectations about the system behavior. We show how one can efficiently implement this algorithm by mapping the inference problem into linear programs. We evaluate our method on a set of standard Markov Decision Process (MDP) benchmarks.

occupancy frequency, reward function, robot, (14 more...)

arXiv.org Artificial Intelligence

2404.08791

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New York (0.04)
North America > United States > Colorado (0.04)
(3 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Add feedback

RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning

Neural Information Processing SystemsMar-13-2024, 09:45:45 GMT

aggregation, performance loss, raam, (15 more...)

Neural Information Processing Systems

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)

Add feedback

Soft-Robust Algorithms for Handling Model Misspecification

Lobo, Elita A., Ghavamzadeh, Mohammad, Petrik, Marek

arXiv.org Machine LearningNov-29-2020

In reinforcement learning, robust policies for high-stakes decision-making problems with limited data are usually computed by optimizing the percentile criterion, which minimizes the probability of a catastrophic failure. Unfortunately, such policies are typically overly conservative as the percentile criterion is non-convex, difficult to optimize, and ignores the mean performance. To overcome these shortcomings, we study the soft-robust criterion, which uses risk measures to balance the mean and percentile criteria better. In this paper, we establish the soft-robust criterion's fundamental properties, show that it is NP-hard to optimize, and propose and analyze two algorithms to optimize it approximately. Our theoretical analyses and empirical evaluations demonstrate that our algorithms compute much less conservative solutions than the existing approximate methods for optimizing the percentile-criterion.

ambiguity, criterion, soft-robust criterion, (14 more...)

arXiv.org Machine Learning

2011.14495

Country:

North America > United States > New Hampshire (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Add feedback

RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning

Petrik, Marek, Subramanian, Dharmashankar

Neural Information Processing SystemsDec-31-2014

We describe how to use robust Markov decision processes for value function approximation with state aggregation. The robustness serves to reduce the sensitivity to the approximation error of sub-optimal policies in comparison to classical methods such as fitted value iteration. This results in reducing the bounds on the gamma-discounted infinite horizon performance loss by a factor of 1/(1-gamma) while preserving polynomial-time computational complexity. Our experimental results show that using the robust representation can significantly improve the solution quality with minimal additional computational cost.

aggregation, performance loss, raam, (16 more...)

Neural Information Processing Systems

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)

Add feedback

Approximate Dynamic Programming By Minimizing Distributionally Robust Bounds

Petrik, Marek

arXiv.org Machine LearningMay-21-2012

Large Markov decision processes (MDPs) are common in reinforcement learning and operations research and are often solved by approximate dynamic programming (ADP). Many ADP algorithms have been developed and studied, often with impressive empirical performance. However, because many ADP methods must be carefully tuned to work well and offer insufficient theoretical guarantees, it is important to develop new methods that have both good theoretical guarantees and empirical performance. Approximate linear programming (ALP)--an ADP method--has been developed with the goal of achieving convergence and good theoretical guarantees (de Farias & van Roy, 2003). Approximate bilinear programming (ABP) improves on the theoretical properties of ALP at the cost of additional computational complexity (Petrik & Zilberstein, 2009, 2011).

artificial intelligence, machine learning, optimization problem, (17 more...)

arXiv.org Machine Learning

1205.1782

Country: North America > Canada (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Minimal Sufficient Explanations for Factored Markov Decision Processes

Khan, Omar Zia (University of Waterloo) | Poupart, Pascal (University of Waterloo) | Black, James P. (University of Waterloo)

AAAI ConferencesSep-19-2009

Explaining policies of Markov Decision Processes (MDPs) is complicated due to their probabilistic and sequential nature. We present a technique to explain policies for factored MDP by populating a set of domain-independent templates. We also present a mechanism to determine a minimal set of templates that, viewed together, completely justify the policy. Our explanations can be generated automatically at run-time with no additional effort required from the MDP designer. We demonstrate our technique using the problems of advising undergraduate students in their course selection and assisting people with dementia in completing the task of handwashing. We also evaluate our explanations for course-advising through a user study involving students.

explanation, frequency, template, (15 more...)

AAAI Conferences

Nineteenth International Conference on Automated Planning and Scheduling

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Dementia (0.54)
Education > Educational Setting > Higher Education (0.48)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback